Exploiting i-vector posterior covariances for short-duration language recognition
Abstract
Linear models in i-vector space have been shown to be an effective solution not only for speaker identification, but also for language recognition. The i-vector extraction process, however, is affected by several factors, such as the noise level, the acoustic content of the utterance, and the duration of the spoken segments. These factors influence both the i-vector estimate and its uncertainty, represented by the i-vector posterior covariance matrix. Modeling i-vector uncertainty with Probabilistic Linear Discriminant Analysis has been shown to be effective for short-duration speaker identification. This paper extends the approach to language recognition, analyzing the effects of i-vector covariances on a state-of-the-art Gaussian classifier, and proposes an effective solution for reducing the average detection cost (Cavg) for short segments.
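As an illustration of how an i-vector posterior covariance can enter a Gaussian classifier, the following is a minimal sketch and not the method of the paper, whose details the abstract does not give. It assumes each language is modeled by a Gaussian with its own mean and a shared within-class covariance, and that the posterior covariance of the test i-vector is simply added to that shared covariance at scoring time. The function name `gaussian_scores` and all numeric values are hypothetical.

```python
import numpy as np

def gaussian_scores(w, post_cov, means, within_cov):
    """Per-language log-likelihoods of an i-vector w.

    Assumption (not taken from the paper): language l is modeled as
    N(mu_l, W) with a shared within-class covariance W, and the
    i-vector posterior covariance post_cov is added to W when scoring.
    """
    total_cov = within_cov + post_cov
    _, logdet = np.linalg.slogdet(total_cov)
    diffs = means - w                                # (n_languages, dim)
    solved = np.linalg.solve(total_cov, diffs.T).T   # total_cov^{-1} (mu_l - w)
    maha = np.einsum("ld,ld->l", diffs, solved)      # squared Mahalanobis distances
    dim = w.shape[0]
    return -0.5 * (maha + logdet + dim * np.log(2.0 * np.pi))

# Toy two-language example with made-up numbers.
means = np.array([[1.0, 0.0], [-1.0, 0.0]])
within_cov = np.eye(2)
w = np.array([0.2, 0.0])

# A short segment is given a large posterior covariance, a long one a small one.
short_scores = gaussian_scores(w, 4.0 * np.eye(2), means, within_cov)
long_scores = gaussian_scores(w, 0.01 * np.eye(2), means, within_cov)

short_margin = short_scores[0] - short_scores[1]
long_margin = long_scores[0] - long_scores[1]
```

Under these assumptions the same i-vector yields a smaller score margin between the two languages when its posterior covariance is large, so uncertain estimates from short segments produce less confident decisions.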
Similar resources
Incorporating uncertainty as a Quality Measure in I-Vector Based Language Recognition
State-of-the-art language recognition systems model utterances with i-vectors. However, the uncertainty of the i-vector extraction process, represented by the i-vector posterior covariance, is affected by various factors such as channel mismatch, background noise, incomplete transformations and duration variability. In this paper, we propose a new quality measure based on the i-vec...
Comparison of spectral methods for spoken language identification
Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...
Exploiting Phone Log-Likelihood Ratio Features for the Detection of the Native Language of Non-Native English Speakers
Detecting the native language (L1) of non-native English speakers may be of great relevance in some applications, such as computer-assisted language learning or IVR services. In fact, the L1 detection problem closely resembles the problem of spoken language and dialect recognition. In particular, log-likelihood ratios of phone posterior probabilities, known as Phone Log-Likelihood Ratios (PLLR),...
KU-ISPL Language Recognition System for NIST 2015 i-Vector Machine Learning Challenge
In language recognition, the task of rejecting/differentiating closely spaced versus acoustically far spaced languages remains a major challenge. For confusable closely spaced languages, the system needs longer input test duration material to obtain sufficient information to distinguish between languages. Alternatively, if languages are distinct and not acoustically/linguistically similar to ot...
I-Vector/PLDA Variants for Text-Dependent Speaker Recognition
The i-vector/PLDA approach currently dominates the field of text-independent speaker recognition and the question of how to translate this methodology to the text-dependent domain has recently become an active area of research. The essential difference between the two fields is that it is possible to do speaker recognition with enrollment and test utterances of very short duration in the text-d...
Publication date: 2015